<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta name="description" content="">
    <meta name="author" content="">
    <meta name="keyword" content="">
    <link rel="shortcut icon" href="/node-mongodb-native/img/favicon.png">

    <title>MongoDB Storage</title>

    <link href="/node-mongodb-native/css/bootstrap-theme.css" rel="stylesheet">
    <link href="/node-mongodb-native/assets/font-awesome/css/font-awesome.min.css" rel="stylesheet" />
    <link href="/node-mongodb-native/css/style.css" rel="stylesheet">
    <link href="/node-mongodb-native/css/style-responsive.css" rel="stylesheet" />
    <link href="/node-mongodb-native/css/monokai_sublime.css" rel="stylesheet" />

  </head>

  <body>

  <section id="container" class="">


      <header class="header black-bg">
            <div class="toggle-nav">
                <i class="fa fa-bars"></i>
                <div class="icon-reorder tooltips" data-original-title="Toggle Navigation" data-placement="bottom"></div>
            </div>


            <a href="/node-mongodb-native/" class="logo"><img src="/node-mongodb-native/img/logo-mongodb-header.png" style="height:40px;"></a>

            <div class="nav title-row" id="top_menu">
                <h1 class="nav top-menu"> MongoDB Storage </h1>
            </div>
      </header>



<aside>
    <div id="sidebar"  class="nav-collapse ">

        <ul class="sidebar-menu">




            <li class="sub-menu active">
            <a href="javascript:;" class="">
                <i class='fa fa-book'></i>

                <span>schema design</span>
                <span class="menu-arrow fa fa-angle-down"></span>
            </a>
                <ul class="sub open">

                    <li><a href="/node-mongodb-native/schema/chapter1/"> Introduction </a> </li>

                    <li><a href="/node-mongodb-native/schema/chapter2/"> Schema Basics </a> </li>

                    <li class="active"><a href="/node-mongodb-native/schema/chapter3/"> MongoDB Storage </a> </li>

                    <li><a href="/node-mongodb-native/schema/chapter4/"> Indexes </a> </li>

                    <li><a href="/node-mongodb-native/schema/chapter5/"> Metadata </a> </li>

                    <li><a href="/node-mongodb-native/schema/chapter6/"> Time Series </a> </li>

                    <li><a href="/node-mongodb-native/schema/chapter7/"> Queues </a> </li>

                    <li><a href="/node-mongodb-native/schema/chapter8/"> Nested Categories </a> </li>

                    <li><a href="/node-mongodb-native/schema/chapter9/"> Account Transactions </a> </li>

                    <li><a href="/node-mongodb-native/schema/chapter10/"> Shopping Cart </a> </li>

                    <li><a href="/node-mongodb-native/schema/chapter11/"> Sharding </a> </li>

                </ul>

              </li>


                <li>
                <a class="" href="https://mongodb.github.io/node-mongodb-native/2.2">
                    <i class='fa fa-file-text'></i>

                    <span>2.0 driver docs</span>
                </a>

              </li>


                <li>
                <a class="" href="https://mongodb.github.io/node-mongodb-native/2.2">
                    <i class='fa fa-file-text'></i>

                    <span>1.4 driver docs</span>
                </a>

              </li>


                <li>
                <a class="" href="https://mongodb.github.io/node-mongodb-native/2.2">
                    <i class='fa fa-file-text'></i>

                    <span>Core driver docs</span>
                </a>

              </li>

            <li> <a href="https://groups.google.com/forum/#!forum/learn-mongo-db-the-hardway" target="blank"><i class='fa fa-life-ring'></i>Issues & Help</a> </li>


        </ul>

    </div>
</aside>




      <section id="main-content">
          <section class="wrapper">













              <div class="row">
                  <div class="col-md-1">


                      <a class="navigation prev" href="/schema/chapter2">
                      <i class="fa fa-angle-left"></i>
                      </a>


                  </div>
                <div class="col-md-10">
                    <section class="panel">



                    <div class="panel-body">



<h1 id="mongodb-storage:b3a7a6b1d300a0da8d3b13d4a559ed84">MongoDB Storage</h1>

<p>To properly understand how a schema design impacts performance it&rsquo;s important to understand how MongoDB works under the covers.</p>

<h2 id="memory-mapped-files:b3a7a6b1d300a0da8d3b13d4a559ed84">Memory Mapped Files</h2>

<p>MongoDB uses memory-mapped files to store it&rsquo;s data (A memory-mapped file is a segment of virtual memory which has been assigned a direct byte-for-byte correlation with some portion of a file or file).</p>

<p><img src="/images/originals/memory_mapping.png" alt="Memory Mapped Files" />
</p>

<p>Memory mapped files lets MongoDB delegate the handling of Virtual Memory to the operating system instead of explicitly managing memory itself. Since the Virtual Address Space is much larger than any physical RAM (Random Access Memory) installed in a computer there is contention about what parts of the Virtual Memory is kept in RAM at any given point in time. When the operating system runs out of RAM and an application requests something that&rsquo;s not currently in RAM it will swap out memory to disk to make space for the newly requested data. Most operating systems will do this using a Least Recently Used (LRU) strategy where the oldest data is swapped to disk first.</p>

<p>When reading up on MongoDB you&rsquo;ll most likely run into the word &ldquo;Working Set&rdquo;. This is the data that your application is constantly requesting. If your &ldquo;Working Set&rdquo; all fits in RAM then all access will be fast as the operating system will not have to swap to and from disk as much. However if your &ldquo;Working Set&rdquo; does not fit in RAM you suffer performance penalties as the operating system needs to swap one part of your &ldquo;Working Set&rdquo; to disk to access another part of it.</p>

<blockquote>
<p><strong>Determine if the Working Set is to big</strong></p>

<p>You can get an indication of if your working set fits in memory by looking at the number of page faults over time. If it&rsquo;s rapidly increasing it might mean your Working Set does not fit in memory.</p>
</blockquote>

<pre><code class="language-bash">&gt;   use mydb
&gt;   db.serverStatus().extra_info.page_faults
</code></pre>

<p>This is usually a sign that it&rsquo;s time to consider either increasing the amount of RAM in your machine or to shard your MongoDB system so more of your &ldquo;Working Set&rdquo; can be kept in memory (sharding splits your &ldquo;Working Set&rdquo; across multiple machines RAM resources).</p>

<h2 id="padding:b3a7a6b1d300a0da8d3b13d4a559ed84">Padding</h2>

<p>Another important aspect to understand with MongoDB is how documents physically grow in the database. Let&rsquo;s take the simple document example below.</p>

<pre><code class="language-json">{
  &quot;hello&quot;: &quot;world&quot;
}
</code></pre>

<p>If we add a new field named <em>name</em> to the document</p>

<pre><code class="language-json">{
  &quot;hello&quot;: &quot;world&quot;,
  &quot;name&quot;: &quot;Christian&quot;
}
</code></pre>

<p>The document will grow in size. If MongoDB was naively implemented it would now need to move the document to a new bigger space as it would have outgrown it&rsquo;s originally allocated space.</p>

<p>However MongoDB stored the original document it added a bit of empty space at the end of the document hence referred to as <strong>padding</strong>. The reason for this padding is that MongoDB expects the document to grow in size over time. As long as this document growth stays inside the additional <strong>padding</strong> space MongoDB does not need to move the document to a new bigger space thus avoiding the cost of copying bytes around in memory, and on disk.</p>

<p><img src="/images/originals/document_with_padding.png" alt="Document With Padding" />
</p>

<p>Over time the <strong>padding factor</strong> that governs how much extra space is appended to a document inserted into MongoDB changes as the database attempts to find the balance between the eventual size of documents and the unused space take up by the <em>padding</em>. However if the growth of individual documents is random MongoDB will not be able to correctly <strong>Pre-Allocate</strong> the right level of <em>padding</em> and the database might end up spending a lot of time copying documents around in memory and on disk instead of performing application specific work causing an impact on write performance.</p>

<blockquote>
<h2 id="how-to-determine-the-padding-factor:b3a7a6b1d300a0da8d3b13d4a559ed84">How to determine the padding factor</h2>

<p>You can determine the <em>padding</em> factor for a specific collection in the following way</p>
</blockquote>

<pre><code class="language-bash">&gt;   use mydb
&gt;   db.my_collection.stats()
</code></pre>

<blockquote>
<p>The returned result contains a field <strong>paddingFactor</strong>. The value tells you how much padding is added. A value of 1 means no padding added a value of 2 means the padding is the same size as the document size.</p>
</blockquote>

<p>A <strong>padding factor</strong> of 1 is usually a sign that the database is spending most of it&rsquo;s time writing new data to memory and disk instead of moving existing data. Having said that one has to take into account the scale of the writing operations. If you have only a 1000 documents in a collection it might not matter if you&rsquo;re <strong>padding factor</strong> is closer to 2. On the other hand if you are writing massive amounts of time series data the impact of moving documents around in memory and on disk might have a severe impact on your performance.</p>

<h2 id="fragmentation:b3a7a6b1d300a0da8d3b13d4a559ed84">Fragmentation</h2>

<p>When documents move around or are removed they leave holes. MongoDB tries to reuse these holes for new documents when ever possible, but over time it will slowly and steadily find itself having a lot of holes that cannot be reused because documents cannot fit in them. This effect is called fragmentation and is common in all systems that allocate memory including your operating system.</p>

<p><img src="/images/originals/fragmentation.png" alt="Document With Padding" />
</p>

<p>The effect of fragmentation is to waste space. Due to the fact that MongoDB uses memory mapped files any fragmentation on disk will be reflected in fragmentation in RAM as well. This has the effect of making less of the &ldquo;Working Set&rdquo; fit in RAM and causing more swapping to disk.</p>

<blockquote>
<h2 id="how-to-determine-the-fragmentation:b3a7a6b1d300a0da8d3b13d4a559ed84">How to determine the fragmentation</h2>

<p>You can get a good indication of fragmentation by</p>
</blockquote>

<pre><code class="language-bash">&gt;   use mydb
&gt;   var s = db.my_collection.stats()
&gt;   var frag = s.storageSize / (s.size + s.totalIndexSize)
</code></pre>

<blockquote>
<p>A <strong>frag</strong> value larger than 1 indicates some level of fragmentation</p>
</blockquote>

<p>There are three main ways of avoiding or limiting fragmentation for your MongoDB data.</p>

<p>The first one is to use the <strong>compact</strong> command on MongoDB to rewrite the data and thus remove the fragmentation. Unfortunately as of 2.6 <strong>compact</strong> is an off-line operation meaning that the database has to be taking out of production for the duration of the <strong>compact</strong> operation</p>

<p>The second option is to use the <strong>usePowerOf2Sizes</strong> option to make MongoDB allocate memory in powers of 2. So instead of allocating memory to fit a specific document MongoDB allocates only in powers of 2 (128 bytes, 256 bytes, 512 bytes, 1024 bytes and so forth). This means there is less chance of a hole not being reused as it will always be a standard size. However it does increase the likeliness of wasted space as a document that is 257 bytes long will occupy a 512 bytes big allocation.</p>

<blockquote>
<p>As of <strong>2.6</strong> <strong>usePowerOf2Sizes</strong> is the default allocation strategy for collections.</p>
</blockquote>

<p>The third and somewhat harder option is to consider fragmentation in your schema design. The application can model it&rsquo;s documents to minimize fragmentation doing such things as pre-allocating the max size of a document and ensuring document size growth is managed correctly. Some of the patterns in this book will discuss aspects of this.</p>



                <div class="col-md-1">


                    <a class="navigation next" href="/schema/chapter4">
                        <i class="fa fa-angle-right"> </i>
                    </a>


                </div>
              </div>

          </section>
      </section>

  </section>


    <script src="/node-mongodb-native/js/jquery-2.1.1.min.js"></script>
    <script src="/node-mongodb-native/js/jquery.scrollTo.min.js"></script>
    <script src="/node-mongodb-native/js/bootstrap.min.js"></script>

    <script src="/node-mongodb-native/js/highlight.pack.js"></script>
    <script>hljs.initHighlightingOnLoad();</script>
    <script src="/node-mongodb-native/js/scripts.js"></script>
    <script async defer id="github-bjs" src="/node-mongodb-native/js/buttons.js"></script>
    <script>
  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
  })(window,document,'script','//www.google-analytics.com/analytics.js','ga');

  ga('create', 'UA-41953057-1', 'auto');
  ga('send', 'pageview');

</script>
  </body>
</html>
